klotz: machine learning* + reinforcement learning*


  1. A beginner-friendly exploration of popular reinforcement learning environments, focusing on using Q-learning to solve the 'Frozen Lake' environment.
  2. An article on Deep Q-Networks (DQNs), which combine Q-learning with the function approximation capabilities of neural networks. This addresses limitations of traditional tabular Q-learning, such as poor scalability and the inability to handle large or continuous state spaces.
  3. This article discusses training a large language model (LLM) with reinforcement learning from human feedback (RLHF) and with a newer alternative, Direct Preference Optimization (DPO). It explains how both methods help align the LLM with human expectations and make it more efficient.
  4. This article discusses the latest open large language model (LLM) releases, including Mixtral 8x22B, Meta AI's Llama 3, and Microsoft's Phi-3, and compares their performance on the MMLU benchmark. It also covers Apple's OpenELM, an efficient language model family released with an open-source training and inference framework, and explores the use of the PPO and DPO algorithms for instruction finetuning and alignment in LLMs.
  5. 2020-04-12 Tags: by klotz
  6. 2020-01-16 Tags: , by klotz
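
As an illustration of the tabular Q-learning approach mentioned in the first bookmark, here is a minimal sketch in Python. It does not come from any of the linked articles. It assumes a hand-written, deterministic (non-slippery) 4x4 grid in the style of Frozen Lake rather than a real environment library, and the names `GRID`, `step`, `train`, and `greedy_rollout`, as well as all hyperparameter values, are hypothetical choices made for this example.

```python
import random

# Hypothetical 4x4 layout in the style of Frozen Lake:
# S = start, F = frozen (safe), H = hole (episode ends), G = goal (reward 1).
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N = 4
# Actions as (row delta, column delta): left, down, right, up.
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(state, action):
    """Deterministic transition; returns (next_state, reward, done)."""
    row, col = divmod(state, N)
    d_row, d_col = ACTIONS[action]
    # Moves that would leave the grid keep the agent in place.
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    cell = GRID[row][col]
    return row * N + col, (1.0 if cell == "G" else 0.0), cell in "HG"


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N * N)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                # Greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=50):
    """Follow the greedy policy from the start; returns (path, final reward)."""
    state, path = 0, [0]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        path.append(state)
        if done:
            return path, reward
    return path, 0.0


if __name__ == "__main__":
    path, reward = greedy_rollout(train())
    print("greedy path:", path, "reward:", reward)
```

The table `q` holds one value per state-action pair, which is the part that stops scaling as the state space grows; a DQN, as described in the second bookmark, replaces that table with a neural network.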
